Data Loaders
1. loadData
Syntax: loadData(data, schema, options)
parameters:
- data:
required: true
type: JSON, String, Array of Arrays
description: Input data in any of the listed formats. See the examples below for how to feed each data format.
- schema:
required: true
type: JSON
description: The field definitions in Object format. The order of the variables in the schema must match the order of the variables in the data.
- options:
required: false
type: Object
default: {}
description: Additional config
parameters:
- useWorker:
required: false
type: boolean
default: true
description: If set to false, a worker is not used to load the data.
- firstRowHeader:
required: false
type: Boolean
default: true
description: Set to true if the first row is a header row, for data such as a DSV string or CSV.
- fieldSeparator:
required: false
type: String
default: ','
description: Used only for CSV string data.
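The schema description above requires the fields to appear in the same order as the variables in the data. As a rough illustration of that rule, the sketch below compares a list of column names with a schema before anything is loaded. `schemaMatchesColumnOrder` and `exampleSchema` are hypothetical names written for this page, not part of the Muze API, and the check assumes every column has a schema entry.

```javascript
// Hypothetical helper (not a Muze function): returns true when the schema
// lists the same field names, in the same order, as the data columns.
// Assumption: every column is described by exactly one schema entry.
function schemaMatchesColumnOrder(columns, schema) {
  return (
    columns.length === schema.length &&
    schema.every((field, index) => field.name === columns[index])
  );
}

const exampleSchema = [
  { name: "Media", type: "dimension" },
  { name: "Year", type: "dimension", subtype: "temporal", format: "%Y" },
  { name: "value", type: "measure" },
];

// Header row in the same order as the schema.
console.log(schemaMatchesColumnOrder(["Media", "Year", "value"], exampleSchema)); // true
// Same fields, different order.
console.log(schemaMatchesColumnOrder(["Year", "Media", "value"], exampleSchema)); // false
```

For flat JSON, the column names could be taken from `Object.keys` of the first record.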
loadData is a helper method that prepares data for the DataModel constructor. Data can be in the form of:
- Flat JSON
- DSV String
- 2D Array
By default, loadData finds a suitable adapter to serialize the data.
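How an adapter might be chosen can be pictured with a small shape check on the input. The sketch below is only an illustration of that idea for the three formats listed above. `detectDataFormat` and the labels it returns are hypothetical and do not describe Muze's internal logic.

```javascript
// Hypothetical sketch (not Muze's code): guess which of the three
// supported input shapes a value has.
function detectDataFormat(data) {
  if (typeof data === "string") {
    return "DSV String";
  }
  if (Array.isArray(data)) {
    // Rows that are themselves arrays indicate a 2D array; anything else
    // is treated as flat JSON (an array of plain objects).
    return Array.isArray(data[0]) ? "2D Array" : "Flat JSON";
  }
  throw new TypeError("Unsupported data format");
}

console.log(detectDataFormat("Media,Year,value\nYoutube,2018,85")); // DSV String
console.log(detectDataFormat([["Media", "Year"], ["Youtube", "2018"]])); // 2D Array
console.log(detectDataFormat([{ Media: "Youtube", Year: "2018" }])); // Flat JSON
```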
const Datamodel = muze.DataModel;
const data = [
  {
    Name: "chevrolet chevelle malibu",
    Miles_per_Gallon: 18,
    Cylinders: 8,
    Horsepower: 130,
    Year: "1970",
  },
  {
    Name: "ford fiesta",
    Miles_per_Gallon: 36.1,
    Cylinders: 4,
    Horsepower: 66,
    Year: "1978",
  },
  {
    Name: "bmw 320i",
    Miles_per_Gallon: 21.5,
    Cylinders: 4,
    Horsepower: 110,
    Year: "1977",
  },
  {
    Name: "chevrolet chevelle malibu",
    Miles_per_Gallon: 18,
    Cylinders: 8,
    Horsepower: 130,
    Year: "1970",
  },
  {
    Name: "ford fiesta",
    Miles_per_Gallon: 36.1,
    Cylinders: 4,
    Horsepower: 66,
    Year: "1978",
  },
  {
    Name: "bmw 320i",
    Miles_per_Gallon: 21.5,
    Cylinders: 4,
    Horsepower: 110,
    Year: "1977",
  },
];
const schema = [
  {
    name: "Name",
    type: "dimension",
  },
  {
    name: "Miles_per_Gallon",
    type: "measure",
  },
  {
    name: "Cylinders",
    type: "dimension",
  },
  {
    name: "Horsepower",
    type: "measure",
  },
  {
    name: "Year",
    type: "dimension",
    format: "%Y",
  },
];
const formattedData = await Datamodel.loadData(data, schema);
let dm = new Datamodel(formattedData);
Output:
Name | Miles_per_Gallon | Cylinders | Horsepower | Year |
---|---|---|---|---|
chevrolet chevelle malibu | 18 | 8 | 130 | 1970 |
ford fiesta | 36.1 | 4 | 66 | 1978 |
bmw 320i | 21.5 | 4 | 110 | 1977 |
chevrolet chevelle malibu | 18 | 8 | 130 | 1970 |
ford fiesta | 36.1 | 4 | 66 | 1978 |
bmw 320i | 21.5 | 4 | 110 | 1977 |
Delimiter Separated Values
DSV (delimiter-separated values) data separates fields with a delimiter such as | or a comma.
Let's consider the following data, which records shifts in social media usage among teenagers as a percentage of count:
Media | Year | value |
---|---|---|
Youtube | 142005060000 | NaN |
Youtube | 151474500000 | 85 |
Instagram | 151474500000 | 72 |
Instagram | 142005060000 | 5 |
Snapchat | 151474500000 | 69 |
We will use CSV data for our example (CSV is just a variation of the DSV format where the delimiter is a comma). The CSV needs a header row so that the schema can identify the variables in the data.
const Datamodel = muze.DataModel;
const data = `Media,Year,value
Youtube,2015,null
Youtube,2018,85
Instagram,2018,72
Instagram,2015,5
Snapchat,2018,69`;
const schema = [
  /* Defines the schema so that DataModel recognizes the variables from data */
  { name: "Media", type: "dimension" },
  { name: "Year", type: "dimension", subtype: "temporal", format: "%Y" },
  { name: "value", type: "measure" },
];
const formattedData = await Datamodel.loadData(data, schema);
let dm = new Datamodel(formattedData);
Printing the underlying data of dm:
Media | Year | value |
---|---|---|
Youtube | 142005060000 | NaN |
Youtube | 151474500000 | 85 |
Instagram | 151474500000 | 72 |
Instagram | 142005060000 | 5 |
Snapchat | 151474500000 | 69 |
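The Year column in the printed table holds numbers rather than year strings, which suggests that temporal values are stored as epoch timestamps. The sketch below shows the general idea for the `%Y` format only. `parseYear` is a hypothetical helper, not Muze's date parser, and it assumes the year is interpreted as local midnight on 1 January, so the exact number depends on the time zone.

```javascript
// Hypothetical helper (not Muze's parser): turn a "%Y" string such as
// "2015" into epoch milliseconds.
// Assumption: the year maps to 1 January, 00:00, in local time.
function parseYear(yearString) {
  // Accept four-digit years only; anything else is treated as invalid.
  if (!/^\d{4}$/.test(String(yearString))) {
    return NaN;
  }
  return new Date(Number(yearString), 0, 1).getTime();
}

const timestamp = parseYear("2015");
console.log(new Date(timestamp).getFullYear()); // 2015
console.log(Number.isNaN(parseYear("null"))); // true
```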
Let's load data in the DSV array format:
const Datamodel = muze.DataModel;
const data = [
  ["Media", "Year", "value"],
  ["Youtube", "2015", null],
  ["Youtube", "2018", 85],
  ["Instagram", "2018", 72],
  ["Instagram", "2015", 5],
  ["Snapchat", "2018", 69],
];
const schema = [
  /* Defines the schema so that DataModel recognizes the variables from data */
  { name: "Media", type: "dimension" },
  { name: "Year", type: "dimension", subtype: "temporal", format: "%Y" },
  { name: "value", type: "measure" },
];
const formattedData = await Datamodel.loadData(data, schema);
let dm = new Datamodel(formattedData);
Printing the underlying data of dm gives us the same output as above:
Media | Year | value |
---|---|---|
Youtube | 142005060000 | NaN |
Youtube | 151474500000 | 85 |
Instagram | 151474500000 | 72 |
Instagram | 142005060000 | 5 |
Snapchat | 151474500000 | 69 |
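The string example above uses a comma, but a DSV string can use another delimiter such as |. As a minimal sketch of what splitting such a string involves, the hypothetical `dsvToRows` below turns a DSV string into a header row and body rows, the same shape as the array example. It is not Muze's adapter: it ignores quoted fields and escaped delimiters, and its `fieldSeparator` and `firstRowHeader` parameters are illustrative names that only mirror the idea of the loadData options.

```javascript
// Hypothetical sketch (not Muze's adapter): split a DSV string into rows
// and cells. Quoted fields and escaped separators are not supported.
function dsvToRows(text, { fieldSeparator = ",", firstRowHeader = true } = {}) {
  const rows = text
    .trim()
    .split(/\r?\n/)
    .map((line) => line.split(fieldSeparator).map((cell) => cell.trim()));
  if (firstRowHeader) {
    const [header, ...body] = rows;
    return { header, body };
  }
  return { header: null, body: rows };
}

const pipeData = `Media|Year|value
Youtube|2015|null
Youtube|2018|85
Instagram|2018|72`;

const parsed = dsvToRows(pipeData, { fieldSeparator: "|" });
console.log(parsed.header); // [ 'Media', 'Year', 'value' ]
console.log(parsed.body.length); // 3
```

All cells come back as strings here; converting them to numbers or dates would be a separate step driven by the schema.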